Archive-name: unix-faq/part3
Version: $Id: part3,v 1.4 92/03/19 14:07:42 tmatimar Exp $
These four articles contain the answers to some Frequently Asked
Questions often seen in comp.unix.questions and comp.unix.shell.
Please don't ask these questions again, they've been answered plenty
of times already - and please don't flame someone just because they may
not have read this particular posting. Thank you.
These articles are divided approximately as follows:
1.*) General questions.
2.*) Relatively basic questions, likely to be asked by beginners.
3.*) Intermediate questions.
4.*) Advanced questions, likely to be asked by people who thought
they already knew all of the answers.
This article includes answers to:
3.1) How do I find out the creation time of a file?
3.2) How do I use "rsh" without having the rsh hang around
until the remote command has completed?
3.3) How do I truncate a file?
3.4) Why doesn't find's "{}" symbol do what I want?
3.5) How do I set the permissions on a symbolic link?
3.6) How do I "undelete" a file?
3.7) How can a process detect if it's running in the background?
3.8) Why doesn't redirecting a loop work as intended? (Bourne shell)
3.9) How do I run 'passwd', 'ftp', 'telnet', 'tip' and other interactive
programs from a shell script or in the background?
3.10) How do I find out the process ID of a program with a particular
name from inside a shell script or C program?
3.11) How do I check the exit status of a remote command
executed via "rsh" ?
3.12) Is it possible to pass shell variable settings into an awk program?
3.13) How do I get rid of zombie processes that persevere?
3.14) How do I get lines from a pipe as they are written instead of
only in larger blocks?
If you're looking for the answer to, say, question 3.5, and want to skip
everything else, you can search ahead for the regular expression "^5)".
While these are all legitimate questions, they seem to crop up in
comp.unix.questions on an annual basis, usually followed by plenty
of replies (only some of which are correct) and then a period of
griping about how the same questions keep coming up. You may also like
to read the monthly article "Answers to Frequently Asked Questions"
in the newsgroup "news.announce.newusers", which will tell you what
"UNIX" stands for.
With the variety of Unix systems in the world, it's hard to guarantee
that these answers will work everywhere. Read your local manual pages
before trying anything suggested here. If you have suggestions or
corrections for any of these answers, please send them to
tmatimar@nff.ncl.omron.co.jp.
1) How do I find out the creation time of a file?
You can't - it isn't stored anywhere. Files have a last-modified
time (shown by "ls -l"), a last-accessed time (shown by "ls -lu")
and an inode change time (shown by "ls -lc"). The latter is often
referred to as the "creation time" - even in some man pages - but
that's wrong; it's also set by such operations as mv, ln,
chmod, chown and chgrp.
The man page for "stat(2)" discusses this.
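For example, here is a quick sketch (using a scratch file) showing how the
three timestamps respond differently to chmod:

```shell
# Create a scratch file and display its three timestamps.
touch scratchfile
ls -l  scratchfile   # last-modified time
ls -lu scratchfile   # last-accessed time
ls -lc scratchfile   # inode-change time

# chmod changes only the inode, so only "ls -lc" shows a new time.
chmod 600 scratchfile
ls -lc scratchfile
rm scratchfile
```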
2) How do I use "rsh" without having the rsh hang around until the
remote command has completed?
(See note in question 2.7 about what "rsh" we're talking about.)
The obvious answers fail:
rsh machine command &
or rsh machine 'command &'
For instance, try doing rsh machine 'sleep 60 &'
and you'll see that the 'rsh' won't exit right away.
It will wait 60 seconds until the remote 'sleep' command
finishes, even though that command was started in the
background on the remote machine. So how do you get
the 'rsh' to exit immediately after the 'sleep' is started?
The solution - if you use csh on the remote machine:
rsh machine -n 'command >&/dev/null </dev/null &'
If you use sh on the remote machine:
rsh machine -n 'command >/dev/null 2>&1 </dev/null &'
Why? "-n" attaches rsh's standard input to /dev/null, so that you can
run the complete rsh command in the background on the LOCAL machine.
Thus "-n" is equivalent to an explicit "< /dev/null".
Furthermore, the input/output redirections on the REMOTE machine
(inside the single quotes) ensure that rsh thinks the session can
be terminated (there's no data flow any more.)
Note: The file that you redirect to/from on the remote machine
doesn't have to be /dev/null; any ordinary file will do.
In many cases, various parts of these complicated commands
aren't necessary.
3) How do I truncate a file?
The BSD function ftruncate() sets the length of a file. Xenix -
and therefore SysV r3.2 and later - has the chsize() system call.
For other systems, the only kind of truncation you can do is
truncation to length zero with creat() or open(..., O_TRUNC).
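From the shell, truncation to length zero is simple, because opening a
file for output with ">" truncates it; a sketch:

```shell
# Put some data in a file, then truncate it to zero length.
echo "some data" > victim
: > victim    # ":" is a no-op; the redirection does the truncating
ls -l victim  # the size is now 0
rm victim
```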
4) Why doesn't find's "{}" symbol do what I want?
"find" has a -exec option that will execute a particular
command on all the selected files. Find will replace any "{}"
it sees with the name of the file currently under consideration.
So, some day you might try to use "find" to run a command on every
file, one directory at a time. You might try this:
find /path -type d -exec command {}/\* \;
hoping that find will execute, in turn
command directory1/*
command directory2/*
...
Unfortunately, find only expands the "{}" token when it appears
by itself. Find will leave anything else like "{}/*" alone, so
instead of doing what you want, it will do
command {}/*
command {}/*
...
once for each directory. This might be a bug, it might be a feature,
but we're stuck with the current behaviour.
So how do you get around this? One way would be to write a
trivial little shell script, let's say "./doit", that
consists of
command "$1"/*
You could then use
find /path -type d -exec ./doit {} \;
Or if you want to avoid the "./doit" shell script, you can use
find /path -type d -exec sh -c 'command $0/*' {} \;
(This works because within the 'command' of "sh -c 'command' A B C ...",
$0 expands to A, $1 to B, and so on.)
or you can use the construct-a-command-with-sed trick
find /path -type d -print | sed 's:.*:command &/*:' | sh
If all you're trying to do is cut down on the number of times
that "command" is executed, you should see if your system
has the "xargs" command. Xargs reads arguments one line at a time
from the standard input and assembles as many of them as will fit into
one command line. You could use
find /path -print | xargs command
which would result in one or more executions of
command file1 file2 file3 file4 dir1/file1 dir1/file2
Unfortunately this is not a perfectly robust or secure solution.
Xargs expects its input lines to be terminated with newlines, so it
will be confused by files with odd characters such as newlines
in their names.
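To see the "sh -c" form in action, here is a small sketch using echo in
place of a real command, on a scratch directory tree:

```shell
# Build a small tree, then echo each directory's contents in turn.
mkdir -p demo/a demo/b
touch demo/a/f1 demo/a/f2 demo/b/g1

# For each directory found, $0 is the directory name, and the inner
# shell expands $0/* before echo runs.
find demo -type d -exec sh -c 'echo $0/*' {} \;
rm -r demo
```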
5) How do I set the permissions on a symbolic link?
Permissions on a symbolic link don't really mean anything. The
only permissions that count are the permissions on the file that
the link points to.
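A quick sketch demonstrating this: chmod applied to the link falls
through to the file it points to.

```shell
touch target
ln -s target link
ls -l link target   # the link's own mode is ignored
chmod 600 link      # follows the link ...
ls -l target        # ... so it is target that is now rw-------
rm link target
```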
6) How do I "undelete" a file?
Someday, you are going to accidentally type something like "rm * .foo",
and find you just deleted "*" instead of "*.foo". Consider it a rite
of passage.
Of course, any decent systems administrator should be doing regular
backups. Check with your sysadmin to see if a recent backup copy
of your file is available. But if it isn't, read on.
For all intents and purposes, when you delete a file with "rm" it is
gone. Once you "rm" a file, the system totally forgets which blocks
scattered around the disk comprised your file. Even worse, the blocks
from the file you just deleted are going to be the first ones taken
and scribbled upon when the system needs more disk space. However,
never say never. It is theoretically possible to recover portions of
the data *if* you shut down the system immediately after the "rm".
However, you had better have a very wizardly type person at hand with
hours or days to spare to get it all back.
Your first reaction when you "rm" a file by mistake may be to ask:
why not make a shell alias or procedure that changes "rm" to move
files into a trash bin rather than delete them? That way you can
recover them if you make a mistake, and periodically clean out your
trash bin. Two points: first, this is generally accepted as a *bad*
idea. You will become dependent upon this behaviour of "rm", and you
will find yourself someday on a normal system where "rm" is really
"rm", and you will get yourself in trouble. Second, you will
eventually find that the hassle of dealing with the disk space and
time involved in maintaining the trash bin makes it easier just to be
a bit more careful with "rm". For starters, you should look up the
"-i" option to "rm" in your manual.
If you are still undaunted, then here is a possible simple answer. You
can create yourself a "can" command which moves files into a
trashcan directory. In csh(1) you can place the following commands
in the ".login" file in your home directory:
alias can 'mv \!* ~/.trashcan' # junk file(s) to trashcan
alias mtcan 'rm -f ~/.trashcan/*' # irretrievably empty trash
if ( ! -d ~/.trashcan ) mkdir ~/.trashcan # ensure trashcan exists
You might also want to put a:
rm -f ~/.trashcan/*
in the ".logout" file in your home directory to automatically empty
the trash when you log out. (sh and ksh versions are left as an
exercise for the reader.)
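For the record, one possible sh/ksh rendering of the same idea, using
shell functions in your ".profile" (the function names here are just as
arbitrary as the csh aliases above):

```shell
can () { mv -- "$@" "$HOME/.trashcan"; }   # junk file(s) to trashcan
mtcan () { rm -f "$HOME"/.trashcan/*; }    # irretrievably empty trash
[ -d "$HOME/.trashcan" ] || mkdir "$HOME/.trashcan"  # ensure it exists
```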
MIT's Project Athena has produced a comprehensive
delete/undelete/expunge/purge package, which can serve as a
complete replacement for rm which allows file recovery. This
package was posted to comp.sources.misc (volume 17, issues 023-026).
7) How can a process detect if it's running in the background?
First of all: do you want to know if you're running in the background,
or if you're running interactively? If you're deciding whether or
not you should print prompts and the like, that's probably a better
criterion. Check if standard input is a terminal:
sh: if [ -t 0 ]; then ... fi
C: if(isatty(0)) { ... }
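A short sketch showing the shell test in context; run interactively it
prints the first message, while with input redirected (e.g.
"sh script < /dev/null") it prints the second:

```shell
#!/bin/sh
# Decide whether to prompt based on whether stdin is a terminal.
if [ -t 0 ]
then
    echo "interactive: prompting"
else
    echo "non-interactive: reading silently"
fi
```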
In general, you can't tell if you're running in the background.
The fundamental problem is that different shells and different
versions of UNIX have different notions of what "foreground" and
"background" mean - and on the most common type of system with a
better-defined notion of what they mean, programs can be moved
arbitrarily between foreground and background!
UNIX systems without job control typically put a process into the
background by ignoring SIGINT and SIGQUIT and redirecting the standard
input to "/dev/null"; this is done by the shell.
Shells that support job control, on UNIX systems that support job
control, put a process into the background by giving it a process group
ID different from the process group to which the terminal belongs. They
move it back into the foreground by setting the terminal's process group
ID to that of the process. Shells that do *not* support job control, on
UNIX systems that support job control, typically do what shells do on
systems that don't support job control.
8) Why doesn't redirecting a loop work as intended? (Bourne shell)
Take the following example:
foo=bar
while read line
do
# do something with $line
foo=bletch
done < /etc/passwd
echo "foo is now: $foo"
Despite the assignment ``foo=bletch'' this will print ``foo is now: bar''
in many implementations of the Bourne shell. Why?
Because of the following, often undocumented, feature of historic
Bourne shells: redirecting a control structure (such as a loop, or an
``if'' statement) causes a subshell to be created, in which the structure
is executed; variables set in that subshell (like the ``foo=bletch''
assignment) don't affect the current shell, of course.
The POSIX 1003.2 Shell and Tools Interface standardization committee
forbids the behaviour described above, i.e. in P1003.2 conformant
Bourne shells the example will print ``foo is now: bletch''.
In historic (and P1003.2 conformant) implementations you can use the
following `trick' to get around the redirection problem:
foo=bar
# make file descriptor 9 a duplicate of file descriptor 0 (stdin);
# then connect stdin to /etc/passwd; the original stdin is now
# `remembered' in file descriptor 9; see dup(2) and sh(1)
exec 9<&0 < /etc/passwd
while read line
do
# do something with $line
foo=bletch
done
# make stdin a duplicate of file descriptor 9, i.e. reconnect it to
# the original stdin; then close file descriptor 9
exec 0<&9 9<&-
echo "foo is now: $foo"
This should always print ``foo is now: bletch''.
Right, take the next example:
foo=bar
echo bletch | read foo
echo "foo is now: $foo"
This will print ``foo is now: bar'' in many implementations,
``foo is now: bletch'' in some others. Why?
Generally each part of a pipeline is run in a different subshell;
in some implementations though, the last command in the pipeline is
made an exception: if it is a builtin command like ``read'', the current
shell will execute it, else another subshell is created.
POSIX 1003.2 allows both behaviours so portable scripts cannot depend
on any of them.
9) How do I run 'passwd', 'ftp', 'telnet', 'tip' and other interactive
programs from a shell script or in the background?
These programs expect a terminal interface. Shells make no special
provision to provide one. Hence, such programs cannot be automated
in shell scripts.
The 'expect' program provides a programmable terminal interface for
automating interaction with such programs. The following expect
script is an example of a non-interactive version of passwd(1).
# username is passed as 1st arg, password as 2nd
set password [index $argv 2]
spawn passwd [index $argv 1]
expect "*password:"
send "$password\r"
expect "*password:"
send "$password\r"
expect eof
Expect can partially automate interaction, which is especially
useful for telnet, rlogin, debuggers, or other programs that have no
built-in command language. The distribution provides an example
script to rerun rogue until a good starting configuration appears.
Then, control is given back to the user to enjoy the game.
Fortunately some programs have been written to manage the connection
to a pseudo-tty so that you can run these sorts of programs in a script.
To get expect, email "send pub/expect/expect.shar.Z" to
library@cme.nist.gov, or fetch the same file by anonymous ftp from
durer.cme.nist.gov.
Another solution is provided by the pty 4.0 program, which runs a
program under a pseudo-tty session and was posted to comp.sources.unix,
volume 25. A pty-based solution using named pipes to do the same as
the above might look like this:
#!/bin/sh
/etc/mknod out.$$ p; exec 2>&1
( exec 4<out.$$; rm -f out.$$
<&4 waitfor 'password:'
echo "$2"
<&4 waitfor 'password:'
echo "$2"
<&4 cat >/dev/null
) | ( pty passwd "$1" >out.$$ )
Here, 'waitfor' is a simple C program that searches for
its argument in the input, character by character.
A simpler pty solution (which has the drawback of not
synchronizing properly with the passwd program) is
#!/bin/sh
( sleep 5; echo "$2"; sleep 5; echo "$2") | pty passwd "$1"
10) How do I find out the process ID of a program with a particular
name from inside a shell script or C program?
In a shell script:
There is no utility specifically designed to map between program names
and process IDs. Furthermore, such mappings are often unreliable,
since it's possible for more than one process to have the same name,
and since it's possible for a process to change its name once it
starts running. However, a pipeline like this can often be used to
get a list of processes (owned by you) with a particular name:
ps ux | awk '/name/ && !/awk/ {print $2}'
You replace "name" with the name of the process for which you are
searching.
The general idea is to parse the output of ps, using awk or grep or
other utilities, to search for the lines with the specified name on
them, and print the PIDs for those lines. Note that the "!/awk/"
above prevents the awk process itself from being listed.
You may have to change the arguments to ps, depending on what kind of
Unix you are using.
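Wrapped up as a function, the pipeline might look like this (a sketch; it
assumes a modern awk with the "-v" option, and as noted the "ps"
arguments may need changing on your system):

```shell
# Print the PIDs of your processes whose ps line matches $1.
pids_named () {
    ps ux | awk -v name="$1" '$0 ~ name && !/awk/ { print $2 }'
}

# Using a bracketed pattern like 'slee[p]' keeps the pipeline from
# matching its own command line:
pids_named 'slee[p]'
```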
In a C program:
Just as there is no utility specifically designed to map between
program names and process IDs, there are no (portable) C library
functions to do it either.
However, some vendors provide functions for reading kernel memory; for
example, Sun provides the "kvm_" functions, and Data General provides
the "dg_" functions. It may be possible for any user to use these, or
they may only be usable by the super-user (or a user in group "kmem")
if read-access to kernel memory on your system is restricted.
Furthermore, these functions are often undocumented or badly
documented, and might change from release to release.
Some vendors provide a "/proc" filesystem, which appears as a
directory with a bunch of filenames in it. Each filename is a number,
corresponding to a process ID, and you can open the file and read it
to get information about the process. Once again, access to this may
be restricted, and the interface to it may change from system to
system.
If you can't use vendor-specific library functions, and you don't have
/proc, and you still want to do this completely in C, you are going to
have to do the grovelling through kernel memory yourself. For a good
example of how to do this on many systems, see the sources to
"ofiles", available in the comp.sources.unix archives.
(A package named "kstuff" to help with kernel grovelling was posted
to alt.sources in May 1991 and is also available via anonymous ftp as
usenet/alt.sources/articles/{329{6,7,8,9},330{0,1}}.Z from
wuarchive.wustl.edu.)
11) How do I check the exit status of a remote command
executed via "rsh" ?
This doesn't work:
rsh some-machine some-crummy-command || echo "Command failed"
The exit status of 'rsh' is 0 (success) if the rsh program
itself completed successfully, which probably isn't what
you wanted.
If you want to check on the exit status of the remote program, you
can try using Maarten Litmaath's 'ersh' script, which was posted to
alt.sources in January, 1991. ersh is a shell script that
calls rsh, arranges for the remote machine to echo the status
of the command after it completes, and exits with that status.
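The idea behind ersh can be sketched locally, with a shell function
standing in for the rsh connection (the "remote" name here is purely
illustrative):

```shell
# Stand-in for: rsh machine "command; echo \$?" - the "remote" end
# runs the command, then echoes that command's exit status.
remote () { sh -c "$1"'; echo $?'; }

status=$(remote 'false' | tail -n 1)
if [ "$status" -ne 0 ]
then
    echo "Command failed with status $status"
fi
```

A real ersh must also take care to separate the echoed status line from
the remote command's own output.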
12) Is it possible to pass shell variable settings into an awk program?
There are two different ways to do this. The first involves simply
expanding the variable where it is needed in the program. For
example, to get a list of all ttys you're using:
who | awk '/^'"$USER"'/ { print $2 }' (1)
Single quotes are usually used to enclose awk programs because the
character '$' is often used in them, and '$' will be interpreted by
the shell if enclosed inside double quotes, but not if enclosed
inside single quotes. In this case, we *want* the '$' in "$USER"
to be interpreted by the shell, so we close the single quotes and
then put the "$USER" inside double quotes. Note that there are no
spaces in any of that, so the shell will see it all as one
argument. Note, further, that the double quotes probably aren't
necessary in this particular case (i.e. we could have done
who | awk '/^'$USER'/ { print $2 }' (2)
), but they should be included nevertheless because they are
necessary when the shell variable in question contains special
characters or spaces.
The second way to pass variable settings into awk is to use an
often undocumented feature of awk which allows variable settings to
be specified as "fake file names" on the command line. For
example:
who | awk '$1 == user { print $2 }' user="$USER" - (3)
Variable settings take effect when they are encountered on the
command line, so, for example, you could instruct awk on how to
behave for different files using this technique. For example:
awk '{ program that depends on s }' s=1 file1 s=0 file2 (4)
Note that some versions of awk will cause variable settings
encountered before any real filenames to take effect before the
BEGIN block is executed, but some won't, so this behaviour should
not be relied upon either way.
Note, further, that when you specify a variable setting, awk won't
automatically read from stdin if no real files are specified, so
you need to add a "-" argument to the end of your command, as I did
at (3) above.
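Method (3) can be tried out with a little canned input in place of
"who" (the user names here are made up):

```shell
# Select the second field of lines whose first field matches the
# awk variable "user", which is set on the command line.
printf 'alice tty01\nbob tty02\n' |
    awk '$1 == user { print $2 }' user=alice -
# prints: tty01
```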
13) How do I get rid of zombie processes that persevere?
From: jik@pit-manager.MIT.Edu (Jonathan I. Kamens)
Date: Fri, 17 Jan 92 14:40:09 -0500
Unfortunately, it's impossible to generalize about how the death of
child processes should be handled, because the exact mechanism varies
among the various flavors of Unix.
First of all, by default, you have to do a wait() for child
processes under ALL flavors of Unix. That is, there is no flavor
of Unix that I know of that will automatically flush child
processes that exit, even if you don't do anything to tell it to do
so.
Second, under some SysV-derived systems, if you do "signal(SIGCHLD,
SIG_IGN)" (well, actually, it may be SIGCLD instead of SIGCHLD, but
most of the newer SysV systems have "#define SIGCHLD SIGCLD" in the
header files), then child processes will be cleaned up
automatically, with no further effort on your part. The best way
to find out if it works at your site is to try it, although if you
are trying to write portable code, it's a bad idea to rely on this
in any case. Unfortunately, POSIX doesn't allow you to do this;
the behavior of setting SIGCHLD to SIG_IGN under POSIX is
undefined, so you can't do it if your program is supposed to be
POSIX-compliant.
If you can't use SIG_IGN to force automatic clean-up, then you've
got to write a signal handler to do it. It isn't easy at all to
write a signal handler that does things right on all flavors of
Unix, because of the following inconsistencies:
On some flavors of Unix, the SIGCHLD signal handler is called if
one *or more* children have died. This means that if your signal
handler only does one wait() call, then it won't clean up all of
the children. Fortunately, I believe that all Unix flavors for
which this is the case have available to the programmer the wait3()
call, which allows the WNOHANG option to check whether or not there
are any children waiting to be cleaned up. Therefore, on any
system that has wait3(), your signal handler should call wait3()
over and over again with the WNOHANG option until there are no
children left to clean up.
On SysV-derived systems, SIGCHLD signals are regenerated if there
are child processes still waiting to be cleaned up after you exit
the SIGCHLD signal handler. Therefore, it's safe on most SysV
systems to assume when the signal handler gets called that you only
have to clean up one signal, and assume that the handler will get
called again if there are more to clean up after it exits.
On older systems, signal handlers are automatically reset to
SIG_DFL when the signal handler gets called. On such systems, you
have to put "signal(SIGCHLD, catcher_func)" (where "catcher_func"
is the name of the handler function) as the first thing in the
signal handler, so that it gets reset. Unfortunately, there is a
race condition which may cause you to get a SIGCHLD signal and have
it ignored between the time your handler gets called and the time
you reset the signal. Fortunately, newer implementations of
signal() don't reset the handler to SIG_DFL when the handler
function is called. To get around this problem, on systems that do
not have wait3() but do have SIGCLD, you need to reset the signal
handler with a call to signal() after doing at least one wait()
within the handler, each time it is called.
The summary of all this is that on systems that have wait3(), you
should use that and your signal handler should loop, and on systems
that don't, you should have one call to wait() per invocation of
the signal handler.
One more thing -- if you don't want to go through all of this
trouble, there is a portable way to avoid this problem, although it
is somewhat less efficient. Your parent process should fork, and
then wait right there and then for the child process to terminate.
The child process then forks again, giving you a child and a
grandchild. The child exits immediately (and hence the parent
waiting for it notices its death and continues to work), and the
grandchild does whatever the child was originally supposed to.
Since its parent died, it is inherited by init, which will do
whatever waiting is needed. This method is inefficient because it
requires an extra fork, but is pretty much completely portable.
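Shell scripts can get the same fork-twice effect almost for free,
because a subshell plays the role of the intermediate child
("long_running_command" is a placeholder):

```shell
# The subshell forks the job and exits immediately; when it dies,
# the background job is inherited by init, which does the waiting.
( long_running_command & )
```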
14) How do I get lines from a pipe as they are written instead of only
in larger blocks?
From: jik@pit-manager.MIT.Edu (Jonathan I. Kamens)
Date: Sun, 16 Feb 92 20:59:28 -0500
The stdio library does buffering differently depending on whether
it thinks it's running on a tty. If it thinks it's on a tty, it
does buffering on a per-line basis; if not, it uses a larger buffer
than one line.
If you have the source code to the client whose buffering you want
to disable, you can use setbuf() or setvbuf() to change the
buffering.
If not, the best you can do is try to convince the program that
it's running on a tty by running it under a pty, e.g. by using the
"pty" program mentioned in question 3.9.
--
Ted Timar - tmatimar@nff.ncl.omron.co.jp
Omron Corporation, Shimokaiinji, Nagaokakyo-city, Kyoto 617, Japan